Shared Memory NUMA Programming on I-WAY

Authors

  • Jarek Nieplocha
  • Robert J. Harrison
Abstract

The performance of the Global Array shared-memory nonuniform memory-access programming model is explored on the I-WAY, a wide-area-network distributed supercomputer environment. The Global Array model is extended by introducing the concept of mirrored arrays. Latencies and bandwidths for remote memory access are studied, and the performance of a large computational chemistry application is evaluated using both fully distributed and mirrored arrays. Excellent performance can be obtained with mirroring even if only modest (0.5 MB/s) network bandwidth is available.


Similar resources

A Programming Interface for NUMA Shared-Memory Clusters

We describe a programming interface for parallel computing on NUMA (NonUniform Memory Access) shared memory machines. Although the interest in this architecture is rapidly growing and more and more hardware manufacturers offer products of this type, there is still a lack in parallelization support. We developed SMI, the Shared Memory Interface, and implemented it as a library on an SCI-coupled ...


Performance Prediction and Evaluation of Parallel Processing on a NUMA Multiprocessor

Non-Uniform Memory Access (NUMA) architectures make it possible to build large-scale shared memory multiprocessor systems, in contrast to non-scalable Uniform Memory Access (UMA) architectures. Most NUMA multiprocessor operations such as scheduling and synchronizing processes, accessing data from processors to memory models and allocating distributed memory space to different processors, are p...


Global Management of Coherent Shared Memory on an SCI Cluster

The I/O-based implementations of the SCI standard allow cost-efficient use of shared memory on a wide range of cluster architectures. These implementations have typically been used for message-passing interfaces, but we are exploiting the use of I/O-based SCI as a way to create NUMA architectures with commodity components. A major issue is that data placement and especially data consistency bec...


Shared Memory Parallelization of the GROMOS96 Molecular Dynamics Code

This paper describes the parallelization of a commercial molecular dynamics simulation code, GROMOS96, on an SCI (Scalable Coherent Interface) interconnected PC cluster. The underlying programming model is that of shared data structures, exploiting SCI’s capability to access segments of remote memory in an entirely transparent way. Methodologies are elaborated that allow to obtain ...


A Tool Environment for Efficient Execution of Shared Memory Programs on NUMA Systems

One of the most important performance issues on NUMA systems is data locality, since remote memory accesses have latencies several orders of magnitude higher than local memory accesses. This paper presents a tool environment aimed at tuning NUMA-based shared memory applications towards better memory locality. This tool environment comprises tools, supporting system facilities, and their interface. To...



Publication date: 1996